Problem Statement¶

Business Context¶

Workplace safety in hazardous environments like construction sites and industrial plants is crucial to prevent accidents and injuries. One of the most important safety measures is ensuring workers wear safety helmets, which protect against head injuries from falling objects and machinery. Non-compliance with helmet regulations increases the risk of serious injuries or fatalities, making effective monitoring essential, especially in large-scale operations where manual oversight is prone to errors and inefficiency.

To overcome these challenges, SafeGuard Corp plans to develop an automated image analysis system capable of detecting whether workers are wearing safety helmets. This system will improve safety enforcement, ensuring compliance and reducing the risk of head injuries. By automating helmet monitoring, SafeGuard aims to enhance efficiency, scalability, and accuracy, ultimately fostering a safer work environment while minimizing human error in safety oversight.

Objective¶

As a data scientist at SafeGuard Corp, you are tasked with developing an image classification model that classifies images into one of two categories:

  • With Helmet: Workers wearing safety helmets.
  • Without Helmet: Workers not wearing safety helmets.

Data Description¶

The dataset consists of 631 images, equally divided into two categories:

  • With Helmet: 311 images showing workers wearing helmets.
  • Without Helmet: 320 images showing workers not wearing helmets.

Dataset Characteristics:

  • Variations in Conditions: Images include diverse environments such as construction sites, factories, and industrial settings, with variations in lighting, angles, and worker postures to simulate real-world conditions.
  • Worker Activities: Workers are depicted in different actions such as standing, using tools, or moving, ensuring robust model learning for various scenarios.

Installing and Importing the Necessary Libraries¶

In [9]:
%pip install numpy pandas matplotlib seaborn scikit-learn opencv-python tensorflow keras pillow  -q
Note: you may need to restart the kernel to use updated packages.
In [10]:
import tensorflow as tf
print("Num GPUs Available:", len(tf.config.list_physical_devices('GPU')))
print(tf.__version__)
Num GPUs Available: 0
2.20.0

Note:

  • After running the above cell, kindly restart the notebook kernel (for Jupyter Notebook) or runtime (for Google Colab) and run all cells sequentially from the next cell.

  • On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in this notebook.

In [11]:
import os
import random
import numpy as np                                                                               # Importing numpy for Matrix Operations
import pandas as pd
import seaborn as sns
import matplotlib.image as mpimg                                                                              # Importing pandas to read CSV files
import matplotlib.pyplot as plt                                                                  # Importting matplotlib for Plotting and visualizing images
import math                                                                                      # Importing math module to perform mathematical operations
import cv2


# Tensorflow modules
import keras
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator                              # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential                                                   # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN Model
from tensorflow.keras.optimizers import Adam,SGD                                                 # Importing the optimizers which can be used in our model
from sklearn import preprocessing                                                                # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split                                             # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix
from tensorflow.keras.models import Model
from keras.applications.vgg16 import VGG16                                               # Importing confusion_matrix to plot the confusion matrix

# Display images using OpenCV
# from google.colab.patches import cv2_imshow

#Imports functions for evaluating the performance of machine learning models
from sklearn.metrics import confusion_matrix, f1_score,accuracy_score, recall_score, precision_score, classification_report
from sklearn.metrics import mean_squared_error as mse                                                 # Importing cv2_imshow from google.patches to display images

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
In [12]:
# Set the seed using keras.utils.set_random_seed. This will set:
# 1) `numpy` seed
# 2) backend random seed
# 3) `python` random seed
tf.keras.utils.set_random_seed(812)

Data Overview¶

Loading the data¶

In [41]:
images = np.load('images_proj.npy')
labels = pd.read_csv('labels_proj.csv')
In [42]:
print(f'Images shape: {images.shape}')
print(f'Labels shape: {labels.shape}')

print(labels.value_counts())
Images shape: (631, 200, 200, 3)
Labels shape: (631, 1)
Label
0        320
1        311
Name: count, dtype: int64
In [60]:
print("minimum value of the image array is ",np.min(images[0]))
print("maximum value of the image array is",np.max(images[0]))
minimum value of the image array is  0
maximum value of the image array is 255

Observations¶

  • The no of images given to train and test split are less --> 631
  • The image counts can be considered as balanced 320, 311

Exploratory Data Analysis¶

Plot random images from each of the classes and print their corresponding labels.¶

In [48]:
def plot_sample_images(images, labels, num_samples=4):
    with_helmet_indices = labels[labels['Label'] == 1].index.tolist()
    without_helmet_indices = labels[labels['Label'] == 0].index.tolist()
    plt.figure(figsize=(12, 6))
    for i in range(num_samples):
        idx = random.choice(with_helmet_indices)
        plt.subplot(2, num_samples, i + 1)
        plt.imshow(images[idx])
        plt.title('With Helmet')
        plt.axis('off')
    
    # Plot random images without helmet
    for i in range(num_samples):
        idx = random.choice(without_helmet_indices)
        plt.subplot(2, num_samples, num_samples + i + 1)
        plt.imshow(images[idx])
        plt.title('Without Helmet')
        plt.axis('off')
    
    plt.tight_layout()
    plt.show()

# Display random samples of images with and without helmets
plot_sample_images(images,labels)
No description has been provided for this image
Observation¶
  • the data sample looks to be very simple, as without helmet images are mostly closeups to face

did run this sample multiple times and saw that almost all images without helmet are closeups.

Checking for class imbalance¶

In [49]:
# Count the number of images in each class
class_distribution = labels['Label'].value_counts()

# Create a bar plot
plt.figure(figsize=(10, 6))
sns.barplot(x=class_distribution.index, y=class_distribution.values)
plt.title('Distribution of Classes')
plt.xlabel('Class (0: Without Helmet, 1: With Helmet)')
plt.ylabel('Number of Images')

# Add value labels on top of each bar
for i, v in enumerate(class_distribution.values):
    plt.text(i, v, str(v), ha='center', va='bottom')

plt.show()

# Calculate the percentage distribution
percentage_distribution = (class_distribution / len(labels) * 100).round(2)
print("\nPercentage Distribution:")
for class_label, percentage in percentage_distribution.items():
    print(f"Class {class_label}: {percentage}%")
No description has been provided for this image
Percentage Distribution:
Class 0: 50.71%
Class 1: 49.29%
Observations¶
  • As observed earlier, we are seeing that the distribution is fairly even
  • the data sample looks to be very simple, as without helmet images are mostly closeups to face
  • The no of images given to train and test split are less --> 631
  • The image counts can be considered as balanced 320, 311
  • The data need to be normalized to use as values are between 0 and 255

Data Preprocessing¶

Converting images to grayscale¶

In [50]:
# Function to convert RGB images to grayscale
def convert_to_grayscale(images):
    gray_images = []
    for img in images:
        # Convert RGB to grayscale using cv2
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        # Add channel dimension for CNN input (H, W, 1)
        gray_img = gray_img[..., np.newaxis]
        gray_images.append(gray_img)
    return np.array(gray_images)

# Convert images to grayscale
gray_images = convert_to_grayscale(images)

# Display sample images before and after conversion
plt.figure(figsize=(12, 6))
for i in range(3):
    # Original RGB image
    plt.subplot(2, 3, i + 1)
    plt.imshow(images[i])
    plt.title('Original RGB')
    plt.axis('off')
    
    # Grayscale image
    plt.subplot(2, 3, i + 4)
    plt.imshow(gray_images[i].squeeze(), cmap='gray')
    plt.title('Grayscale')
    plt.axis('off')

plt.tight_layout()
plt.show()

print("Original image shape:", images[0].shape)
print("Grayscale image shape:", gray_images[0].shape)
No description has been provided for this image
Original image shape: (200, 200, 3)
Grayscale image shape: (200, 200, 1)
In [51]:
# Apply Gaussian blur to the grayscale images
def apply_gaussian_blur(images, kernel_size=(5,5)):
    blurred_images = []
    for img in images:
        # Apply Gaussian blur
        blurred = cv2.GaussianBlur(img, kernel_size, 0)
        blurred_images.append(blurred)
    return np.array(blurred_images)

# Apply blur to grayscale images
blurred_images = apply_gaussian_blur(gray_images)
In [52]:
# Apply Laplacian edge detection to the grayscale images
def apply_laplacian(images, ksize=3):
    laplacian_images = []
    for img in images:
        # Apply Gaussian blur first to reduce noise
        blurred = cv2.GaussianBlur(img, (5,5), 0)
        # Apply Laplacian
        laplacian = cv2.Laplacian(blurred, cv2.CV_64F, ksize=ksize)
        # Convert back to uint8 and normalize to 0-255 range
        laplacian = np.uint8(np.absolute(laplacian))
        laplacian_images.append(laplacian)
    return np.array(laplacian_images)

# Apply Laplacian edge detection on grayscale images
laplacian_images = apply_laplacian(gray_images)

# Display sample images to compare all preprocessing steps
plt.figure(figsize=(15, 8))
for i in range(3):
    # Original RGB image
    plt.subplot(4, 3, i + 1)
    plt.imshow(images[i])
    plt.title('Original RGB')
    plt.axis('off')
    
    # Grayscale image
    plt.subplot(4, 3, i + 4)
    plt.imshow(gray_images[i].squeeze(), cmap='gray')
    plt.title('Grayscale')
    plt.axis('off')
    
    # Blurred image
    plt.subplot(4, 3, i + 7)
    plt.imshow(blurred_images[i].squeeze(), cmap='gray')
    plt.title('Gaussian Blur')
    plt.axis('off')
    
    # Laplacian image
    plt.subplot(4, 3, i + 10)
    plt.imshow(laplacian_images[i].squeeze(), cmap='gray')
    plt.title('Laplacian Edge Detection')
    plt.axis('off')

plt.tight_layout()
plt.show()

print("Original image shape:", images[0].shape)
print("Grayscale image shape:", gray_images[0].shape)
print("Blurred image shape:", blurred_images[0].shape)
print("Laplacian image shape:", laplacian_images[0].shape)
No description has been provided for this image
Original image shape: (200, 200, 3)
Grayscale image shape: (200, 200, 1)
Blurred image shape: (200, 200)
Laplacian image shape: (200, 200)
preparing images¶
  • we can use gray scale or blurred images if we need to improve the performance fo the model.
  • later stages of the project we can decide to use these images if needed.
  • General observations based on above images, we can see edges and curves in various styles of the image filters

Splitting the dataset¶

In [66]:
# images, gray_images, blurred_images, laplacian_images
# splitting the data into training and testing sets , since there are only 631 samples, we are using 60% for training, 20% for validation and 20% for testing
# splitting rgb images
x_train_rgb, x_temp_rgb, y_train_rgb, y_temp_rgb = train_test_split(images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
x_val_rgb, x_test_rgb, y_val_rgb, y_test_rgb = train_test_split(x_temp_rgb, y_temp_rgb, test_size=0.5, random_state=42, stratify=y_temp_rgb)
In [ ]:
# splitting grayscale images
#x_train_gray, x_temp_gray, y_train_gray, y_temp_gray = train_test_split(gray_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
#x_val_gray, x_test_gray, y_val_gray, y_test_gray = train_test_split(x_temp_gray, y_temp_gray, test_size=0.5, random_state=42, stratify=y_temp_gray)
In [ ]:
# splitting blurred images
#x_train_blur, x_temp_blur, y_train_blur, y_temp_blur = train_test_split(blurred_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
#x_val_blur, x_test_blur, y_val_blur, y_test_blur = train_test_split(x_temp_blur, y_temp_blur, test_size=0.5, random_state=42, stratify=y_temp_blur)
In [ ]:
# splitting laplacian images
#x_train_lap, x_temp_lap, y_train_lap, y_temp_lap = train_test_split(laplacian_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
#x_val_lap, x_test_lap, y_val_lap, y_test_lap = train_test_split(x_temp_lap, y_temp_lap, test_size=0.5, random_state=42, stratify=y_temp_lap)
  • We can also try ANN route to experiment with by using different filters (commented as these are out of scope for this assignment)

Data Normalization¶

In [70]:
# label binarizer is not needed as the labels are already in binary format (0 and 1)
# as we observed earlier in data exploration the values are between 0 and 255, lets normalize them
# normalizing rgb images
x_train_normalized_rgb = x_train_rgb.astype('float32') / 255.0
x_val_normalized_rgb = x_val_rgb.astype('float32') / 255.0
x_test_normalized_rgb = x_test_rgb.astype('float32') / 255.0
In [ ]:
# normalizing grayscale images
# x_train_normalized_gray = x_train_gray.astype('float32') / 255.0
# x_val_normalized_gray = x_val_gray.astype('float32') / 255.0
# x_test_normalized_gray = x_test_gray.astype('float32') / 255.0
In [ ]:
# normalizing blurred images
# x_train_normalized_blur = x_train_blur.astype('float32') / 255.0
# x_val_normalized_blur = x_val_blur.astype('float32') / 255.0
# x_test_normalized_blur = x_test_blur.astype('float32') / 255.0
In [ ]:
# normalizing laplacian images
# x_train_normalized_lap = x_train_lap.astype('float32') / 255.0
# x_val_normalized_lap = x_val_lap.astype('float32') / 255.0
# x_test_normalized_lap = x_test_lap.astype('float32') / 255.0
  • Normalized the data sets to train the models

  • We can also try ANN instead of CNN using the above converted images (out of scope for this assignment)

  • we will build all our models now for RGB .

Model Building¶

Model Evaluation Criterion¶

Utility Functions¶

In [74]:
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors).reshape(-1)>0.5

    target = target.to_numpy().reshape(-1)


    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred, average='weighted')  # to compute Recall
    precision = precision_score(target, pred, average='weighted')  # to compute Precision
    f1 = f1_score(target, pred, average='weighted')  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame({"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,},index=[0],)

    return df_perf
In [75]:
def plot_confusion_matrix(model,predictors,target,ml=False):
    """
    Function to plot the confusion matrix

    model: classifier
    predictors: independent variables
    target: dependent variable
    ml: To specify if the model used is an sklearn ML model or not (True means ML model)
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors).reshape(-1)>0.5

    target = target.to_numpy().reshape(-1)

    # Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
    confusion_matrix = tf.math.confusion_matrix(target,pred)
    f, ax = plt.subplots(figsize=(10, 8))
    sns.heatmap(
        confusion_matrix,
        annot=True,
        linewidths=.4,
        fmt="d",
        square=True,
        ax=ax
    )
    plt.show()

Model 1: Simple Convolutional Neural Network (CNN)¶

In [76]:
# Basic CNN model for helmet detection (using RGB images)
cnn_model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=x_train_rgb.shape[1:]),#200,200,3
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

cnn_model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

cnn_model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 198, 198, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 99, 99, 32)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 97, 97, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 48, 48, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 147456)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 64)             │     9,437,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 9,456,705 (36.07 MB)
 Trainable params: 9,456,705 (36.07 MB)
 Non-trainable params: 0 (0.00 B)
In [77]:
# Fit the basic CNN model using normalized RGB images
history_basic_cnn = cnn_model.fit(
    x_train_normalized_rgb, y_train_rgb,
    epochs=20,
    batch_size=32,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2,
    shuffle=True
)
Epoch 1/20
12/12 - 3s - 278ms/step - accuracy: 0.6534 - loss: 1.3288 - val_accuracy: 0.9206 - val_loss: 0.2089
Epoch 2/20
12/12 - 3s - 278ms/step - accuracy: 0.6534 - loss: 1.3288 - val_accuracy: 0.9206 - val_loss: 0.2089
Epoch 2/20
12/12 - 2s - 197ms/step - accuracy: 0.9656 - loss: 0.1129 - val_accuracy: 0.9524 - val_loss: 0.1005
Epoch 3/20
12/12 - 2s - 197ms/step - accuracy: 0.9656 - loss: 0.1129 - val_accuracy: 0.9524 - val_loss: 0.1005
Epoch 3/20
12/12 - 2s - 192ms/step - accuracy: 0.9709 - loss: 0.0821 - val_accuracy: 0.9683 - val_loss: 0.0765
Epoch 4/20
12/12 - 2s - 192ms/step - accuracy: 0.9709 - loss: 0.0821 - val_accuracy: 0.9683 - val_loss: 0.0765
Epoch 4/20
12/12 - 2s - 192ms/step - accuracy: 0.9894 - loss: 0.0298 - val_accuracy: 0.9762 - val_loss: 0.0407
Epoch 5/20
12/12 - 2s - 192ms/step - accuracy: 0.9894 - loss: 0.0298 - val_accuracy: 0.9762 - val_loss: 0.0407
Epoch 5/20
12/12 - 2s - 200ms/step - accuracy: 0.9974 - loss: 0.0134 - val_accuracy: 0.9762 - val_loss: 0.0539
Epoch 6/20
12/12 - 2s - 200ms/step - accuracy: 0.9974 - loss: 0.0134 - val_accuracy: 0.9762 - val_loss: 0.0539
Epoch 6/20
12/12 - 2s - 189ms/step - accuracy: 0.9974 - loss: 0.0165 - val_accuracy: 0.9841 - val_loss: 0.0171
Epoch 7/20
12/12 - 2s - 189ms/step - accuracy: 0.9974 - loss: 0.0165 - val_accuracy: 0.9841 - val_loss: 0.0171
Epoch 7/20
12/12 - 2s - 188ms/step - accuracy: 1.0000 - loss: 0.0072 - val_accuracy: 1.0000 - val_loss: 0.0040
Epoch 8/20
12/12 - 2s - 188ms/step - accuracy: 1.0000 - loss: 0.0072 - val_accuracy: 1.0000 - val_loss: 0.0040
Epoch 8/20
12/12 - 2s - 189ms/step - accuracy: 0.9974 - loss: 0.0062 - val_accuracy: 0.9841 - val_loss: 0.0360
Epoch 9/20
12/12 - 2s - 189ms/step - accuracy: 0.9974 - loss: 0.0062 - val_accuracy: 0.9841 - val_loss: 0.0360
Epoch 9/20
12/12 - 2s - 189ms/step - accuracy: 1.0000 - loss: 0.0015 - val_accuracy: 1.0000 - val_loss: 0.0031
Epoch 10/20
12/12 - 2s - 189ms/step - accuracy: 1.0000 - loss: 0.0015 - val_accuracy: 1.0000 - val_loss: 0.0031
Epoch 10/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 0.0023 - val_accuracy: 0.9841 - val_loss: 0.0324
Epoch 11/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 0.0023 - val_accuracy: 0.9841 - val_loss: 0.0324
Epoch 11/20
12/12 - 2s - 188ms/step - accuracy: 1.0000 - loss: 0.0022 - val_accuracy: 0.9921 - val_loss: 0.0105
Epoch 12/20
12/12 - 2s - 188ms/step - accuracy: 1.0000 - loss: 0.0022 - val_accuracy: 0.9921 - val_loss: 0.0105
Epoch 12/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 0.0034 - val_accuracy: 0.9841 - val_loss: 0.0255
Epoch 13/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 0.0034 - val_accuracy: 0.9841 - val_loss: 0.0255
Epoch 13/20
12/12 - 2s - 190ms/step - accuracy: 0.9947 - loss: 0.0154 - val_accuracy: 0.9841 - val_loss: 0.0510
Epoch 14/20
12/12 - 2s - 190ms/step - accuracy: 0.9947 - loss: 0.0154 - val_accuracy: 0.9841 - val_loss: 0.0510
Epoch 14/20
12/12 - 2s - 188ms/step - accuracy: 0.9947 - loss: 0.0099 - val_accuracy: 0.9683 - val_loss: 0.1182
Epoch 15/20
12/12 - 2s - 188ms/step - accuracy: 0.9947 - loss: 0.0099 - val_accuracy: 0.9683 - val_loss: 0.1182
Epoch 15/20
12/12 - 2s - 189ms/step - accuracy: 1.0000 - loss: 0.0013 - val_accuracy: 0.9841 - val_loss: 0.0173
Epoch 16/20
12/12 - 2s - 189ms/step - accuracy: 1.0000 - loss: 0.0013 - val_accuracy: 0.9841 - val_loss: 0.0173
Epoch 16/20
12/12 - 2s - 197ms/step - accuracy: 1.0000 - loss: 0.0020 - val_accuracy: 0.9762 - val_loss: 0.0607
Epoch 17/20
12/12 - 2s - 197ms/step - accuracy: 1.0000 - loss: 0.0020 - val_accuracy: 0.9762 - val_loss: 0.0607
Epoch 17/20
12/12 - 2s - 193ms/step - accuracy: 1.0000 - loss: 4.1857e-04 - val_accuracy: 1.0000 - val_loss: 9.4013e-04
Epoch 18/20
12/12 - 2s - 193ms/step - accuracy: 1.0000 - loss: 4.1857e-04 - val_accuracy: 1.0000 - val_loss: 9.4013e-04
Epoch 18/20
12/12 - 2s - 186ms/step - accuracy: 1.0000 - loss: 3.4337e-04 - val_accuracy: 0.9841 - val_loss: 0.0445
Epoch 19/20
12/12 - 2s - 186ms/step - accuracy: 1.0000 - loss: 3.4337e-04 - val_accuracy: 0.9841 - val_loss: 0.0445
Epoch 19/20
12/12 - 2s - 187ms/step - accuracy: 0.9974 - loss: 0.0101 - val_accuracy: 0.9444 - val_loss: 0.3482
Epoch 20/20
12/12 - 2s - 187ms/step - accuracy: 0.9974 - loss: 0.0101 - val_accuracy: 0.9444 - val_loss: 0.3482
Epoch 20/20
12/12 - 2s - 184ms/step - accuracy: 0.9947 - loss: 0.0148 - val_accuracy: 0.9762 - val_loss: 0.0700
12/12 - 2s - 184ms/step - accuracy: 0.9947 - loss: 0.0148 - val_accuracy: 0.9762 - val_loss: 0.0700
In [92]:
def plot_training_history(history):
    """
    Function to plot training and validation accuracy and loss
    history: History object returned by model.fit()
    """
    # Plot training and validation accuracy and loss
    plt.figure(figsize=(14, 5))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Val Accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Val Loss')
    plt.title('Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()

    plt.tight_layout()
    plt.show()

plot_training_history(history_basic_cnn)
No description has been provided for this image
In [102]:
performance_test_basic_cnn = model_performance_classification(cnn_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of Basic CNN Model on Test set of RGB Images:")
print(performance_test_basic_cnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 94ms/step
Performance of Basic CNN Model on Test set of RGB Images:
   Accuracy    Recall  Precision  F1 Score
0  0.968504  0.968504   0.968504  0.968504
In [82]:
# Plot confusion matrix for test set predictions
plot_confusion_matrix(cnn_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 45ms/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 45ms/step
No description has been provided for this image
In [101]:
performance_val_basic_cnn = model_performance_classification(cnn_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of Basic CNN Model on val set of RGB Images:")
print(performance_val_basic_cnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 108ms/step
Performance of Basic CNN Model on val set of RGB Images:
   Accuracy   Recall  Precision  F1 Score
0   0.97619  0.97619   0.977257  0.976168
In [86]:
# Plot confusion matrix for val set predictions
plot_confusion_matrix(cnn_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step
No description has been provided for this image
In [146]:
performance_train_basic_cnn = model_performance_classification(cnn_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic CNN Model on Training set of RGB Images:")
print(performance_train_basic_cnn)
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 43ms/step
Performance of Basic CNN Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 43ms/step
Performance of Basic CNN Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [147]:
# Plot confusion matrix for val set predictions
plot_confusion_matrix(cnn_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 47ms/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 47ms/step
No description has been provided for this image

Vizualizing the predictions¶

In [110]:
def plot_sample_predictions_on_val_set(model):   
     # Visualize predictions on val images
    num_samples = 8
    plt.figure(figsize=(16, 8))
    pred_probs = model.predict(x_val_normalized_rgb[:num_samples])
    pred_labels = (pred_probs > 0.5).astype(int).reshape(-1)
    for i in range(num_samples):
        plt.subplot(2, num_samples//2, i+1)
        plt.imshow(x_val_rgb[i])
        true_label = y_val_rgb.iloc[i] if hasattr(y_val_rgb, 'iloc') else y_val_rgb[i]
        plt.title(f"True: {'Helmet' if true_label==1 else 'No Helmet'}\nPred: {'Helmet' if pred_labels[i]==1 else 'No Helmet'}")
        plt.axis('off')
    plt.tight_layout()
    plt.show()

def plot_sample_predictions_on_test_set(model):   
     # Visualize predictions on val images
    num_samples = 8
    plt.figure(figsize=(16, 8))
    pred_probs = model.predict(x_test_normalized_rgb[:num_samples])
    pred_labels = (pred_probs > 0.5).astype(int).reshape(-1)
    for i in range(num_samples):
        plt.subplot(2, num_samples//2, i+1)
        plt.imshow(x_test_rgb[i])
        true_label = y_test_rgb.iloc[i] if hasattr(y_test_rgb, 'iloc') else y_test_rgb[i]
        plt.title(f"True: {'Helmet' if true_label==1 else 'No Helmet'}\nPred: {'Helmet' if pred_labels[i]==1 else 'No Helmet'}")
        plt.axis('off')
    plt.tight_layout()
    plt.show()

plot_sample_predictions_on_val_set(cnn_model)

plot_sample_predictions_on_test_set(cnn_model)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
No description has been provided for this image
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
No description has been provided for this image
  • A simple CNN peformed very well with the given data
  • lets try with a pretrained CNN VGG16 model to see how it works

Model 2: (VGG-16 (Base))¶

In [97]:
# VGG16-based model for helmet detection (using RGB images)
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Load VGG16 base (without top, with imagenet weights)
vgg_base = VGG16(weights='imagenet', include_top=False, input_shape=x_train_rgb.shape[1:])

# Making all the layers of the VGG model non-trainable. i.e. freezing them
for layer in vgg_base.layers:
    layer.trainable = False

vgg_base.summary()

vgg16_model = Sequential() # Initializing the Sequential model
vgg16_model.add(vgg_base) # Adding the VGG16 base model
vgg16_model.add(Flatten())# Flattening the output of the VGG16 model
vgg16_model.add(Dense(1, activation='sigmoid'))

vgg16_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])

vgg16_model.summary()
Model: "vgg16"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_layer_5 (InputLayer)      │ (None, 200, 200, 3)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_conv1 (Conv2D)           │ (None, 200, 200, 64)   │         1,792 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_conv2 (Conv2D)           │ (None, 200, 200, 64)   │        36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_pool (MaxPooling2D)      │ (None, 100, 100, 64)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_conv1 (Conv2D)           │ (None, 100, 100, 128)  │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_conv2 (Conv2D)           │ (None, 100, 100, 128)  │       147,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_pool (MaxPooling2D)      │ (None, 50, 50, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv1 (Conv2D)           │ (None, 50, 50, 256)    │       295,168 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv2 (Conv2D)           │ (None, 50, 50, 256)    │       590,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv3 (Conv2D)           │ (None, 50, 50, 256)    │       590,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_pool (MaxPooling2D)      │ (None, 25, 25, 256)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv1 (Conv2D)           │ (None, 25, 25, 512)    │     1,180,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv2 (Conv2D)           │ (None, 25, 25, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv3 (Conv2D)           │ (None, 25, 25, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_pool (MaxPooling2D)      │ (None, 12, 12, 512)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv1 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv2 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv3 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_pool (MaxPooling2D)      │ (None, 6, 6, 512)      │             0 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 14,714,688 (56.13 MB)
 Trainable params: 0 (0.00 B)
 Non-trainable params: 14,714,688 (56.13 MB)
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_2 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 1)              │        18,433 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 14,733,121 (56.20 MB)
 Trainable params: 18,433 (72.00 KB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [113]:
trainDataGen = ImageDataGenerator()
In [149]:
# Fit the VGG16 model using normalized RGB images
history_vgg16 = vgg16_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=20,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
Epoch 1/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.5255e-04 - val_accuracy: 1.0000 - val_loss: 0.0015
Epoch 2/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.4765e-04 - val_accuracy: 1.0000 - val_loss: 0.0014
Epoch 3/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.4267e-04 - val_accuracy: 1.0000 - val_loss: 0.0014
Epoch 4/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.3879e-04 - val_accuracy: 1.0000 - val_loss: 0.0014
Epoch 5/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.3487e-04 - val_accuracy: 1.0000 - val_loss: 0.0014
Epoch 6/20
12/12 - 22s - 2s/step - accuracy: 1.0000 - loss: 1.3036e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 7/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.2687e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 8/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.2333e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 9/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.1979e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 10/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.1641e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 11/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.1340e-04 - val_accuracy: 1.0000 - val_loss: 0.0013
Epoch 12/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.1028e-04 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 13/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.0744e-04 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 14/20
12/12 - 32s - 3s/step - accuracy: 1.0000 - loss: 1.0457e-04 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 15/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 1.0233e-04 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 16/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 9.9427e-05 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 17/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 9.6941e-05 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 18/20
12/12 - 24s - 2s/step - accuracy: 1.0000 - loss: 9.4766e-05 - val_accuracy: 1.0000 - val_loss: 0.0011
Epoch 19/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 9.2594e-05 - val_accuracy: 1.0000 - val_loss: 0.0012
Epoch 20/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 9.0056e-05 - val_accuracy: 1.0000 - val_loss: 0.0011
In [121]:
plot_training_history(history_vgg16)
No description has been provided for this image
In [150]:
performance_train_basic_vgg = model_performance_classification(vgg16_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model on Training set of RGB Images:")
print(performance_train_basic_vgg)
12/12 ━━━━━━━━━━━━━━━━━━━━ 18s 1s/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 18s 1s/step
Performance of Basic VGG Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of Basic VGG Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [ ]:
plot_confusion_matrix(vgg16_model, x_train_normalized_rgb, y_train_rgb, ml=True)
In [116]:
performance_val_basic_vgg = model_performance_classification(vgg16_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of Basic CNN vgg16 Model on Val set of RGB Images:")
print(performance_val_basic_vgg)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
Performance of Basic CNN vgg16 Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of Basic CNN vgg16 Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [117]:
plot_confusion_matrix(vgg16_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image
In [118]:
# peroformance classification on test set
performance_test_basic_vgg = model_performance_classification(vgg16_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of Basic CNN vgg16 Model on Test set of RGB Images:")
print(performance_test_basic_vgg)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of Basic CNN vgg16 Model on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of Basic CNN vgg16 Model on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [119]:
# Plot confusion matrix for test set predictions
plot_confusion_matrix(vgg16_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image

Visualizing the prediction:¶

In [120]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 946ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 946ms/step
No description has been provided for this image

Observations from VGG16 Model Results¶

  • The VGG16 transfer learning model achieved strong accuracy on both validation and test sets, indicating good generalization.

  • Training and validation accuracy curves show minimal overfitting, likely due to frozen base layers and regularization (Dropout).

  • The confusion matrix reveals that the model distinguishes well between 'Helmet' and 'No Helmet' classes.

  • VGG and basic CNN performed almost same for this usecase. VGG being pretrained can still be good choice to be ready for future scenarios

  • Further improvements could be made with adding fully connected layers, data augmentation, fine-tuning, or experimenting with other architectures.

Model 3: (VGG-16 (Base + FFNN))¶

In [122]:
# lets create a feed forward neural network using the extracted features from VGG16 model
vgg16_ffnn_model = Sequential()
vgg16_ffnn_model.add(vgg_base) # Adding the VGG16 base model
vgg16_ffnn_model.add(Flatten())# Flattening the output of the VGG 

# Adding fully connected layers
vgg16_ffnn_model.add(Dense(128, activation='relu'))
vgg16_ffnn_model.add(Dropout(0.5))
vgg16_ffnn_model.add(Dense(64, activation='relu'))

vgg16_ffnn_model.add(Dense(1, activation='sigmoid'))
In [123]:
vgg16_ffnn_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])
vgg16_ffnn_model.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_3 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense)                 │ (None, 128)            │     2,359,424 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense)                │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 17,082,433 (65.16 MB)
 Trainable params: 2,367,745 (9.03 MB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [124]:
# Fit the vgg16_ffnn_model  using normalized RGB images
history_vgg16_ffnn = vgg16_ffnn_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=20,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
# re using the trainDataGen defined earlier for default data augmentation
Epoch 1/20
12/12 - 24s - 2s/step - accuracy: 0.8519 - loss: 0.3792 - val_accuracy: 1.0000 - val_loss: 0.0050
Epoch 2/20
12/12 - 24s - 2s/step - accuracy: 0.8519 - loss: 0.3792 - val_accuracy: 1.0000 - val_loss: 0.0050
Epoch 2/20
12/12 - 23s - 2s/step - accuracy: 0.9921 - loss: 0.0188 - val_accuracy: 1.0000 - val_loss: 0.0049
Epoch 3/20
12/12 - 23s - 2s/step - accuracy: 0.9921 - loss: 0.0188 - val_accuracy: 1.0000 - val_loss: 0.0049
Epoch 3/20
12/12 - 23s - 2s/step - accuracy: 0.9921 - loss: 0.0179 - val_accuracy: 1.0000 - val_loss: 2.8533e-04
Epoch 4/20
12/12 - 23s - 2s/step - accuracy: 0.9921 - loss: 0.0179 - val_accuracy: 1.0000 - val_loss: 2.8533e-04
Epoch 4/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 4.0035e-04 - val_accuracy: 1.0000 - val_loss: 2.2263e-04
Epoch 5/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 4.0035e-04 - val_accuracy: 1.0000 - val_loss: 2.2263e-04
Epoch 5/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0019 - val_accuracy: 1.0000 - val_loss: 2.9632e-04
Epoch 6/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0019 - val_accuracy: 1.0000 - val_loss: 2.9632e-04
Epoch 6/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0015 - val_accuracy: 1.0000 - val_loss: 4.8067e-04
Epoch 7/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0015 - val_accuracy: 1.0000 - val_loss: 4.8067e-04
Epoch 7/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 4.8532e-04 - val_accuracy: 1.0000 - val_loss: 5.0774e-04
Epoch 8/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 4.8532e-04 - val_accuracy: 1.0000 - val_loss: 5.0774e-04
Epoch 8/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 3.7928e-04
Epoch 9/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 3.7928e-04
Epoch 9/20
12/12 - 23s - 2s/step - accuracy: 0.9974 - loss: 0.0024 - val_accuracy: 1.0000 - val_loss: 1.9390e-04
Epoch 10/20
12/12 - 23s - 2s/step - accuracy: 0.9974 - loss: 0.0024 - val_accuracy: 1.0000 - val_loss: 1.9390e-04
Epoch 10/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 5.6826e-04 - val_accuracy: 1.0000 - val_loss: 3.3066e-04
Epoch 11/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 5.6826e-04 - val_accuracy: 1.0000 - val_loss: 3.3066e-04
Epoch 11/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 6.8775e-04 - val_accuracy: 1.0000 - val_loss: 1.1498e-04
Epoch 12/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 6.8775e-04 - val_accuracy: 1.0000 - val_loss: 1.1498e-04
Epoch 12/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 8.4927e-05 - val_accuracy: 1.0000 - val_loss: 9.1289e-05
Epoch 13/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 8.4927e-05 - val_accuracy: 1.0000 - val_loss: 9.1289e-05
Epoch 13/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 8.8464e-04 - val_accuracy: 1.0000 - val_loss: 1.3917e-04
Epoch 14/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 8.8464e-04 - val_accuracy: 1.0000 - val_loss: 1.3917e-04
Epoch 14/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0014 - val_accuracy: 1.0000 - val_loss: 1.2696e-04
Epoch 15/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0014 - val_accuracy: 1.0000 - val_loss: 1.2696e-04
Epoch 15/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 2.1061e-04 - val_accuracy: 1.0000 - val_loss: 7.8338e-05
Epoch 16/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 2.1061e-04 - val_accuracy: 1.0000 - val_loss: 7.8338e-05
Epoch 16/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 5.1372e-04 - val_accuracy: 1.0000 - val_loss: 6.7067e-05
Epoch 17/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 5.1372e-04 - val_accuracy: 1.0000 - val_loss: 6.7067e-05
Epoch 17/20
12/12 - 22s - 2s/step - accuracy: 1.0000 - loss: 3.7446e-04 - val_accuracy: 1.0000 - val_loss: 4.0564e-05
Epoch 18/20
12/12 - 22s - 2s/step - accuracy: 1.0000 - loss: 3.7446e-04 - val_accuracy: 1.0000 - val_loss: 4.0564e-05
Epoch 18/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 7.3079e-05 - val_accuracy: 1.0000 - val_loss: 3.4063e-05
Epoch 19/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 7.3079e-05 - val_accuracy: 1.0000 - val_loss: 3.4063e-05
Epoch 19/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0010 - val_accuracy: 1.0000 - val_loss: 3.0404e-05
Epoch 20/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 0.0010 - val_accuracy: 1.0000 - val_loss: 3.0404e-05
Epoch 20/20
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 9.0353e-04 - val_accuracy: 1.0000 - val_loss: 3.3269e-05
12/12 - 23s - 2s/step - accuracy: 1.0000 - loss: 9.0353e-04 - val_accuracy: 1.0000 - val_loss: 3.3269e-05
In [129]:
plot_training_history(history_vgg16_ffnn)
No description has been provided for this image
In [125]:
# performance classification on validation set
performance_val_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of VGG16 FFNN Model on Val set of RGB Images:")
print(performance_val_vgg16_ffnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of VGG16 FFNN Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of VGG16 FFNN Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [126]:
# confusion matrix for validation set
plot_confusion_matrix(vgg16_ffnn_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image
In [127]:
# performance classification on test set
performance_test_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of VGG16 FFNN Model on Test set of RGB Images:")
print(performance_test_vgg16_ffnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of VGG16 FFNN Model on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of VGG16 FFNN Model on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [128]:
# confusion matrix for test set
plot_confusion_matrix(vgg16_ffnn_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image
In [151]:
performance_train_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model with FFNN on Training set of RGB Images:")
print(performance_train_vgg16_ffnn)
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
Performance of Basic VGG Model with FFNN on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of Basic VGG Model with FFNN on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [152]:
# confusion matrix for train set
plot_confusion_matrix(vgg16_ffnn_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
No description has been provided for this image

Visualizing the predictions¶

In [130]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_ffnn_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_ffnn_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 403ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 403ms/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 434ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 434ms/step
No description has been provided for this image

Observations¶

  • VGG16-Base + FFNN performed well
  • This model performance is same as base model, perhaps may be because the size of the data .
  • These pretrain models are learning fast as the pictures are clearly distinguishable.

Model 4: (VGG-16 (Base + FFNN + Data Augmentation)¶

  • In most of the real-world case studies, it is challenging to acquire a large number of images and then train CNNs.

  • To overcome this problem, one approach we might consider is Data Augmentation.

  • CNNs have the property of translational invariance, which means they can recognise an object even if its appearance shifts translationally in some way. - Taking this attribute into account, we can augment the images using the techniques listed below

    • Horizontal Flip (should be set to True/False)
    • Vertical Flip (should be set to True/False)
    • Height Shift (should be between 0 and 1)
    • Width Shift (should be between 0 and 1)
    • Rotation (should be between 0 and 180)
    • Shear (should be between 0 and 1)
    • Zoom (should be between 0 and 1) etc.

Remember, data augmentation should not be used in the validation/test data set.

In [131]:
# lets create a feed forward neural network using the extracted features from VGG16 model (same as model-3 but for clarity i am defining again)
vgg16_ffnn_da_model = Sequential()
vgg16_ffnn_da_model.add(vgg_base) # Adding the VGG16 base model
vgg16_ffnn_da_model.add(Flatten())# Flattening the output of the VGG 

# Adding fully connected layers
vgg16_ffnn_da_model.add(Dense(128, activation='relu'))
vgg16_ffnn_da_model.add(Dropout(0.5))
vgg16_ffnn_da_model.add(Dense(64, activation='relu'))

vgg16_ffnn_da_model.add(Dense(1, activation='sigmoid'))

# the idea here is we will use data augmentation to train the model
In [132]:
vgg16_ffnn_da_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])
vgg16_ffnn_da_model.summary()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_4 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_11 (Dense)                │ (None, 128)            │     2,359,424 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_4 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_12 (Dense)                │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_13 (Dense)                │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 17,082,433 (65.16 MB)
 Trainable params: 2,367,745 (9.03 MB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [135]:
from tensorflow.keras.callbacks import EarlyStopping

# since we are going to use data augumentation we will define a new ImageDataGenerator with some data augmentation techniques
dataGenAugmented = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode='nearest'
)

# also lets define a early stopping callback to prevent overfitting

early_stopping = EarlyStopping(
    monitor='val_loss',  # Monitor validation loss
    patience=5,         # Stop if no improvement for 15 epochs
    mode='min',          # Minimize the validation loss
    verbose=1,
    restore_best_weights=True  # Restore best weights found during training
)
In [139]:
# Fit the vgg16_ffnn_da_model  using normalized RGB images
history_vgg16_ffnn_da_model = vgg16_ffnn_da_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=200, # increased epochs since we have early stopping
    callbacks=[early_stopping],
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
Epoch 1/200
12/12 - 24s - 2s/step - accuracy: 0.9974 - loss: 0.0129 - val_accuracy: 1.0000 - val_loss: 4.1607e-04
Epoch 2/200
12/12 - 23s - 2s/step - accuracy: 0.9947 - loss: 0.0171 - val_accuracy: 1.0000 - val_loss: 3.4274e-04
Epoch 3/200
12/12 - 24s - 2s/step - accuracy: 1.0000 - loss: 0.0024 - val_accuracy: 1.0000 - val_loss: 9.1997e-04
Epoch 4/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 1.0000 - val_loss: 5.7209e-04
Epoch 5/200
12/12 - 24s - 2s/step - accuracy: 1.0000 - loss: 3.5766e-04 - val_accuracy: 1.0000 - val_loss: 1.6499e-04
Epoch 5: early stopping
Restoring model weights from the end of the best epoch: 1.
In [140]:
plot_training_history(history_vgg16_ffnn_da_model)
No description has been provided for this image
In [141]:
# performance classification on validation set
performance_val_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of VGG16 FFNN Model with data augumentation on Val set of RGB Images:")
print(performance_val_vgg16_ffnn_da)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
Performance of VGG16 FFNN Model with data augumentation on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of VGG16 FFNN Model with data augumentation on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [142]:
# confusion matrix for validation set
plot_confusion_matrix(vgg16_ffnn_da_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
No description has been provided for this image
In [143]:
# performance classification on Test set
performance_test_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of VGG16 FFNN Model with data augumentation on Test set of RGB Images:")
print(performance_test_vgg16_ffnn_da)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
Performance of VGG16 FFNN Model with data augumentation on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
Performance of VGG16 FFNN Model with data augumentation on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [144]:
#confusion matrix for test set
plot_confusion_matrix(vgg16_ffnn_da_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 1s/step
No description has been provided for this image
In [153]:
performance_train_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model with FFNN and Data Augumentation on Training set of RGB Images:")
print(performance_train_vgg16_ffnn_da)
12/12 ━━━━━━━━━━━━━━━━━━━━ 18s 1s/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 18s 1s/step
Performance of Basic VGG Model with FFNN and Data Augumentation on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
Performance of Basic VGG Model with FFNN and Data Augumentation on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [154]:
plot_confusion_matrix(vgg16_ffnn_da_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
12/12 ━━━━━━━━━━━━━━━━━━━━ 17s 1s/step
No description has been provided for this image

Visualizing the predictions¶

In [145]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_ffnn_da_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_ffnn_da_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 384ms/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 437ms/step
No description has been provided for this image
Observations¶
  • This Model also performed well and is on par with previous models in performance
  • The training was faster because of the usage of early stopping we provided , even though epcohs are high (stopped at 5 epochs)
  • Data Augumentation can be a future option once a mode diverse images are colelcted , but without the Data Augumentation also the elarier models have high accuracy scores

Model Performance Comparison and Final Model Selection¶

In [163]:
# load all performance results into a dataframe for comparison
performance_comparison = pd.DataFrame({
    'Model': ['Basic CNN RGB', 'VGG16 RGB', 'VGG16 FFNN RGB', 'VGG16 FFNN DA RGB'],
    'Train Accuracy': [
        performance_train_basic_cnn['Accuracy'].values[0],
        performance_train_basic_vgg['Accuracy'].values[0],
        performance_train_vgg16_ffnn['Accuracy'].values[0],
        performance_train_vgg16_ffnn_da['Accuracy'].values[0]
    ],
    'Val Accuracy': [
        performance_val_basic_cnn['Accuracy'].values[0],
        performance_val_basic_vgg['Accuracy'].values[0],
        performance_val_vgg16_ffnn['Accuracy'].values[0],
        performance_val_vgg16_ffnn_da['Accuracy'].values[0]
    ],
    'Test Accuracy': [
        performance_test_basic_cnn['Accuracy'].values[0],
        performance_test_basic_vgg['Accuracy'].values[0],
        performance_test_vgg16_ffnn['Accuracy'].values[0],
        performance_test_vgg16_ffnn_da['Accuracy'].values[0]
    ]
})
In [164]:
# display the performance comparison
print("Performance Comparison of Different Models:")
print(performance_comparison)
Performance Comparison of Different Models:
               Model  Train Accuracy  Val Accuracy  Test Accuracy
0      Basic CNN RGB             1.0       0.97619       0.968504
1          VGG16 RGB             1.0       1.00000       1.000000
2     VGG16 FFNN RGB             1.0       1.00000       1.000000
3  VGG16 FFNN DA RGB             1.0       1.00000       1.000000

Test Performance¶

In [170]:
testPerformances = pd.concat([
    performance_test_basic_cnn.T,
    performance_test_basic_vgg.T,
    performance_test_vgg16_ffnn.T,
    performance_test_vgg16_ffnn_da.T
], axis=1)   

testPerformances.columns = ['Basic CNN RGB', 'VGG16 RGB', 'VGG16 FFNN RGB', 'VGG16 FFNN DA RGB']
print(testPerformances)
           Basic CNN RGB  VGG16 RGB  VGG16 FFNN RGB  VGG16 FFNN DA RGB
Accuracy        0.968504        1.0             1.0                1.0
Recall          0.968504        1.0             1.0                1.0
Precision       0.968504        1.0             1.0                1.0
F1 Score        0.968504        1.0             1.0                1.0

Actionable Insights & Recommendations¶

Recommendations¶

  • Deployment: The VGG model with FFNN (feed forward neural networks) vgg16_ffnn_model is suggested for the deployment as we did a trasnfer learning on a pretrained model.
  • option-2 Deployment: For this use case the base CNN model also performed well, in case of limited computing resources the base CNN model also works well as it is very small and can be run on CPU.

Actionable Insights¶

  • We need to collect more images preferbly negative cases of real workers in the field without helmets instead of closeup images and expand the data set to retrain the model with more images.

Project Info¶

  • High Model Performance: The VGG-16 based models achieved perfect or near-perfect accuracy on the test set. This indicates that the models are highly effective for the given dataset. The features learned by VGG-16 on ImageNet are highly transferable to this problem.
  • Transfer Learning: The pre-trained VGG-16 model, even without fine-tuning, performed exceptionally well. This highlights the power of transfer learning for computer vision tasks, especially when the dataset is small.
  • Data Quality: The dataset is small, hence all models performed well, including the base CNN without any pretrained model inclusion. The non-helmet images are closeups thats the reason the model learned faster
  • Data Augmentation: While data augmentation is a standard practice to prevent overfitting and improve generalization, the model without data augmentation already performed perfectly. With early stopping, the augmented data model training stopped very early. This suggests the original dataset might be relatively easy for the model to learn.

Power Ahead!